Transparent runtime parallelization of the R scripting language

Authors

  • Jiangtian Li
  • Xiaosong Ma
  • Srikanth B. Yoginath
  • Guruprasad Kora
  • Nagiza F. Samatova
Abstract

Scripting languages such as R and Matlab are widely used in scientific data processing. As the data volume and the complexity of analysis tasks both grow, sequential data processing using these tools often becomes the bottleneck in scientific workflows. We describe pR, a runtime framework for automatic and transparent parallelization of the popular R language used in statistical computing. Recognizing scripting languages’ interpreted nature and data analysis codes’ use pattern, we propose several novel techniques: (1) applying parallelizing compiler technology to runtime, whole-program dependence analysis of scripting languages, (2) incremental code analysis assisted with evaluation results, and (3) runtime parallelization of file accesses. Our framework does not require any modification to either the source code or the underlying R implementation. Experimental results demonstrate that pR can exploit both task and data parallelism transparently and overall has better performance as well as scalability compared to an existing parallel R package that requires code modification.
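The abstract does not give pR's algorithm, so the following is only a minimal sketch of the general idea behind statement-level dependence analysis, with Python standing in for R. The helper names `read_write_sets`, `depends`, and `schedule` are hypothetical. The sketch assumes that function calls have no side effects and ignores file accesses, which the abstract says pR handles separately.

```python
import ast


def read_write_sets(stmt):
    """Return (reads, writes): variable names a statement reads and writes."""
    reads, writes = set(), set()
    for node in ast.walk(stmt):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, (ast.Store, ast.Del)):
                writes.add(node.id)
            else:
                reads.add(node.id)
        # `x += 1` both reads and writes x, but the AST marks x as Store only.
        if isinstance(node, ast.AugAssign) and isinstance(node.target, ast.Name):
            reads.add(node.target.id)
    return reads, writes


def depends(earlier, later):
    """True if `later` must wait for `earlier` (flow, anti, or output dependence)."""
    r1, w1 = earlier
    r2, w2 = later
    return bool((w1 & r2) or (r1 & w2) or (w1 & w2))


def schedule(source):
    """Group top-level statement indices into waves; statements in the same
    wave have no dependence on each other and could run in parallel."""
    stmts = ast.parse(source).body
    sets = [read_write_sets(s) for s in stmts]
    level = []
    for j in range(len(stmts)):
        preds = [level[i] for i in range(j) if depends(sets[i], sets[j])]
        level.append(1 + max(preds, default=-1))
    waves = {}
    for idx, lv in enumerate(level):
        waves.setdefault(lv, []).append(idx)
    return [waves[k] for k in sorted(waves)]


# Hypothetical analysis script: two independent load/summarize chains.
example = """
a = load("x.csv")
b = load("y.csv")
c = summarize(a)
d = summarize(b)
e = combine(c, d)
"""
print(schedule(example))  # [[0, 1], [2, 3], [4]]
```

In this example the two loads form one wave, the two summaries a second, and the final combine a third, which is the kind of task parallelism the abstract refers to. A real system would also need to reason about side effects inside called functions and about loops, which this sketch does not attempt.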

Similar resources

StarFlow: A Script-Centric Data Analysis Environment

We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions ena...

Oberon Script: A Lightweight Compiler and Runtime System for the Web

Oberon Script is a scripting language and runtime system for building interactive Web Client applications. It is based on the Oberon programming language and consists of a compiler that translates Oberon Script at load-time into JavaScript code, and a small runtime system that detects and compiles script sections written in Oberon Script.

Safe Parallel Programming in an Interpreted Language

Parallel programming is increasingly important with the advent of multicore processors. However, modern software is difficult to parallelize because of the high degree of modularization. It is unclear whether a piece of code is parallel if it calls other functions. Dynamic languages such as Ruby, Python, and Matlab represent modularization to the extreme. A program, also known as a script, requ...

Programming Network Components Using NetPebbles: An Early Report

A network-centric application developer faces a number of challenges, including distributed program design, efficient remote object access, software reuse, and program deployment issues. This level of complexity hinders the developer's ability to focus on the application logic. NetPebbles removes this complexity from the developer through a network-component based scripting environment where remo...

Scripting For Java

Tcl was initially developed as an embeddable command language to provide what we now call “scripting” to complex applications. The “scripting” or “high level language” approach of giving applications control through command lines, configuration files, or “macros” has been very successful and a major winning case for Tcl. In the last six years, Java appeared as a programming language and r...


Journal:
  • J. Parallel Distrib. Comput.

Volume 71

Publication date 2011